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ENCODERS FOR BLOCK-CIRCULANT LDPC 
CODES 

GOVERNMENT INTEREST 

The invention described herein was made in the perfor- 
mance of work under a NASA contract, and is subject to the 
provisions of Public Law 96-517 (35 USC 202) in which the 
Contractor has elected to retain title. 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

This application is filed on the same day of U.S. Pat. App. 
Ser. No. 1 1,166,040, now U.S. Pat. No. 7,343,539, for “ARA 
Type Protograph Codes”, incorporated herein by reference in 
its entirety. 

BACKGROUND 

1. Field 

The present disclosure relates to encoders and encoding 
methods for block-circulant low-density parity-check 
(LDPC) codes. In particular, a first encoder and iterative 
encoding method are based on the erasure decoding algo- 
rithm. The computations required are well organized due to 
the block-circulant structure of the parity check matrix. A 
second encoder and method use block-circulant generator 
matrices. Some encoders of the second type have been imple- 
mented in a small Field Programmable Gate Array (FPGA) 
and can operate at 100 Msymbols/second. 

2. Related Art 

Recently, block-circulant LDPC codes have been found 
that provide both excellent error correction performance and 
well structured decoder architectures. Constructions have 
been presented in the following papers:. 

Y. Kou, H. Tang, S. Lin, and K. Abdel-Ghaffar, “On Circulant 
Low Density Parity Check Codes,” IEEE International 
Symposium on Information Theory, p. 200, June 2002; 

S. Lin, “Quasi -Cyclic LDPC Codes.” CCSDS working group 
white paper, Oct. 2003; 

R. M. Tanner, “On Graph Constructions for LDPC Codes by 
Quasi- Cyclic Extension,” in Information, Coding and 
Mathematics (M. Blaum, P. Farrell, and H. van Tilborg, 
eds.), pp. 209-220, Kluwer, June 2002; 

A. Sridharan, D. Costello, and R. M. Tanner, “A Construction 
for Low Density Parity Check Convolutional Codes Based 
on Quasi-Cyclic Block Codes,” in IEEE International 
Symposium on Information Theory, p. 481, June 2002; 

O. Milenkovic, I. Djordjevic, andB. Vasic, “Block-Circulant 
Low-Density Parity-Check Codes for Optical Communi- 
cation Systems,” IEEE Journal of Selected Topics in Quan- 
tum Electronics, pp. 294-299, March 2004; 

J. Thorpe, K. Andrews, and S. Dolinar, “Methodologies for 
Designing LDPC Codes Using Protographs and Circu- 
lants ,” in IEEE International Symposium on Information 
Theory, p. 238, June 2004), and others. 

All of the above papers are incorporated herein by reference 
in their entirety. 

Error correcting codes are used to transmit information 
reliably over an unreliable channel, such as a radio commu- 
nications link or a magnetic recording system. One class of 
error correcting codes are binary block codes, where K infor- 
mation bits are encoded into a codeword of N symbols (N>K), 
the codeword is transmitted over the channel, and a decoder 
then attempts to decode the received (and potentially cor- 
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rupted) symbols into the original K information bits. If the 
channel symbols are also binary, an encoder that uses the K 
information bits as K of the. N channel symbols is known as 
a systematic encoder. These K channel symbols are called the 
5 systematic symbols, and the remaining N-K symbols are 
called parity symbols. Sometimes one uses an encoder that 
generates N+P symbols, and then P of them are discarded 
while the remaining N are transmitted over the channel. The 
discarded symbols are known as punctured symbols, 
to Many different mathematical models are used to describe 
physical communications channels. One model is the Binary 
Erasure Channel (BEC). The input alphabet is binary (either 
0 or 1), and the output alphabet is ternary (0, 1, or e for 
erasure). When a 0 is transmitted over the BEC, the received 
15 symbol may be either 0 or e; similarly, a transmitted 1 is 
received either as a 1 or e. An erasure correcting decoder is 
used with a BEC, and its task is to reconstruct the binary 
values that were transmitted and corrupted to the value e by 
the channel. In particular, puncturing a codeword is equiva- 
20 lent to transmitting it over a BEC, where each punctured 
symbol is corrupted to the value e. 

Erasure correcting decoders for LDPC codes have been 
studied at length [see, for example, M. Luby, M. Mitzenma- 
cher, A. Shokrollahi, D. Spielman, andV. Stemann, “Practical 
25 loss-resilient codes,” in Proc. 29th Annual ACM Symp. 
Theory of Computing, 1997, pp. 150-159], and the decoding 
method described in that paper has become the standard era- 
sure correcting algorithm. This erasure correcting algorithm 
succeeds if and only if the erased symbol positions do not 
30 contain a stopping set [see T. Richardson and R. Urbanke, 
"Efficient Encoding of Low-Density Parity-Check Codes, 
IEE Trans, on Information Theory, February 2001, pp. 638- 
656]. 

U.S. Pub. App. No. 20040153934 discloses a method and 
35 apparatus for encoding LDPC codes. 

SUMMARY 

In accordance with the present disclosure, novel encoders, 
40 encoding methods and a hardware encoder implementation 
for block-circulant LDPC codes will be presented. 

According to a first aspect, an encoding apparatus to 
encode message input symbols in accordance with an accu- 
mulate-repeat-accumulate code with repetition four is dis- 
45 closed, the apparatus comprising: a first multiplier to multiply 
a first portion of the input symbols with a first matrix, forming 
first intermediate symbols; a second multiplier to multiply a 
second portion of the input symbols with a second matrix, 
forming second intermediate symbols; a first adder to sum the 
50 first intermediate symbols with the second intermediate sym- 
bols, forming third intermediate symbols; a third multiplier to 
multiply the third intermediate symbols with a third matrix, 
forming fourth intermediate symbols; a fourth multiplier to 
multiply the third intermediate symbols with a fourth matrix, 
55 forming a first set of output symbols; a second adder to sum 
the fourth intermediate symbols with the second portion of 
the input symbols, forming fifth intermediate symbols; a per- 
muter to permute the fifth intermediate symbols, forming 
permuted symbols; and an accumulator to accumulate the 
60 permuted symbols, forming a second set of output symbols. 

According to a second aspect, a method for encoding mes- 
sage input symbols in accordance with an accumulate-repeat- 
accumulate code with repetition four is disclosed, compris- 
ing: multiplying a first portion of the input symbols with a first 
65 matrix, forming first intermediate symbols; multiplying a 
second portion of the input symbols with a second matrix, 
forming second intermediate symbols; adding the first inter- 
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mediate symbols to the second intermediate symbols, form- 
ing third intermediate symbols; multiplying the third inter- 
mediate symbols with a third matrix, forming fourth 
intermediate symbols; multiplying the third intermediate 
symbols with a fourth matrix, forming a first set of output 
symbols; adding the fourth intermediate symbols with the 
input symbols, forming fifth intermediate symbols; permut- 
ing the fifth intermediate symbols, forming permuted sym- 
bols; and accumulating the permuted symbols, forming a 
second set of output symbols. 

According to a third aspect, an encoding apparatus to 
encode message input symbols in accordance with an accu- 
mulate-repeat-accumulate code with repetition three is dis- 
closed, the apparatus comprising: a puncturing device, punc- 
turing k input symbols and outputting k/2 input symbols, 
forming a first set of output symbols; a first multiplier to 
multiply the k input symbols with a first matrix, forming first 
intermediate symbols; a second multiplier to multiply the k 
input symbols with a second matrix, forming a second set of 
output symbols; a permuter to permute the first intermediate 
symbols, forming permuted symbols; and an accumulator to 
accumulate the permuted symbols, forming a third set of 
output symbols. 

According to a fourth aspect, a method for encoding mes- 
sage input symbols in accordance with an accumulate-repeat- 
accumulate code with repetition three is disclosed, compris- 
ing: puncturing k input symbols and outputting k/2 input 
symbols, forming a first set of output symbols; multiplying 
the k input symbols with a first matrix, forming first interme- 
diate symbols; multiplying the k input symbols with a second 
matrix, forming a second set of output symbols; permuting 
the first intermediate symbols, forming permuted symbols; 
and accumulating the permuted symbols, forming a third set 
of output symbols. 

According to a fifth aspect, an encoding apparatus to 
encode input symbols in accordance with a block-circulant 
LDPC code is disclosed, the apparatus comprising: a plurality 
of recursive convolutional encoders, each recursive convolu- 
tional encoder comprising storage units, multipliers and 
adders to encode the input symbols; and a plurality of circu- 
lant patterns to be fed to the recursive convolutional encoders, 
one set of patterns for each recursive convolutional encoder. 

According to a sixth aspect, a method for encoding input 
symbols in accordance with a block-circulant LDPC code is 
disclosed, comprising: providing a plurality of recursive con- 
volutional encoders, each recursive convolutional encoder 
comprising storage units, multipliers and adders; setting the 
storage units to a first binary value; repeating the following 
operations: i) computing a set of circulant patterns, ii) pro- 
viding each recursive convolutional encoder with a binary 
sequence of T message bits, each message bit sent to the 
output as a codeword symbol, and each message bit being 
multiplied with a circulant pattern, summed to the result of a 
previous multiplication, stored in a storage unit and shifted, 
until the T message bits have been encoded, until kT message 
bits have been encoded; and generating an output codeword 
by reading the contents of the storage units of the recursive 
convolutional encoders. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 shows a protograph for an AR3A code. 

FIG. 2 shows a protograph for an AR4A code. 

FIG. 3 shows a parity check matrix for an AR4A code. 

FIG. 4 shows a block diagram of an AR4A encoder. 

FIG. 5 shows a parity check matrix for an AR3A code. 

FIG. 6 shows a block diagram of an AR3 A encoder. 
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FIG. 7 shows a systematic block-circulant generator matrix 
for the AR3A code. 

FIG. 8 shows a systematic block-circulant generator matrix 
for the AR4A code. 

5 FIG. 9 shows a hardware implementation of a quasicyclic 
encoder. 

FIG. 10 shows a hardware implementation of a quasicyclic 
encoder using feedback shift registers. 

to DETAILED DESCRIPTION 

1 . Introduction 

In this section, AR3A and AR4A codes will be introduced, 
15 protographs for the AR3A and AR4A codes will be shown, 
and block-circulant parity check matrixes for the AR3A and 
AR4A codes will be described. 

Throughout the present description, a circulant will be 
defined as a square binary matrix where each row is con- 
20 structed from the previous row by a single right cyclic shift. It 
will not be required that each row has Hamming weight 1 . 

An rTxnT parity check matrix H can be constructed by 
concatenating rxn sparse circulants of size TxT. The density 
of each circulant matrix is indicated by the corresponding 
25 value in an rxn base matrix Yi base . 

The Tanner graph corresponding to this matrix is called a 
protograph (see J. Thorpe, “Low-Density Parity-Check 
(LDPC) Codes Constructed from Protographs,” IPN Progress 
Report 42-154, JPL, August 2003). See also the “Related Art” 
30 section of U.S. patent application Ser. No. 11/166,040 for 
“ARA Type Protograph Codes”, incorporated herein by ref- 
erence in its entirety. Entries greater than 1 in the base matrix 
correspond to multiple edges in the protograph. Base matrices 
can be expanded into block-circulant LDPC codes by replac- 
35 ing each entry in H base , with a circulant containing rows of the 
specified Hamming weight. The resulting codes are quasicy- 
clic. Alternatively, they can be expanded into less structured 
codes by replacing each entry with a sum of arbitrary permu- 
tation matrices. 

40 AR3 A and AR4A codes are described in the applicant’ s 

patent application Ser. No. 11/166,040 for “ARA Type Pro- 
tograph Codes”, filed on the same day of the present applica- 
tion and incorporated herein by reference in its entirety. FIGS . 
1 and 2 of the present application show protographs for AR3 A 
45 and AR4 A codes and will be used as examples throughout the 
present description. Squares represent parity check nodes and 
circles represent variable nodes, where the black circles rep- 
resent transmitted symbols and the white circles represent 
punctured symbols. The designs of FIGS. 1 and 2 are derived 
50 from a three step encoding procedure: accumulate, repeat- 
by-3 (or 4), and accumulate shown by the Applicants in A. 
Abbasfar, D. Divsalar, and K. Yao, “Accumulate Repeat 
Accumulate Codes,” IEEE International Symposium on 
Information Theory, (Chicago, 111.), June 2004, and U.S. 
55 patent application Ser. No. 1 1/166,040 for “ARA Type Pro- 
tograph Codes”, both incorporated herein by reference in 
their entirety. Each protograph describes a 3x5 block-circu- 
lant parity check matrix, and the number of parallel edges 
shows the degree of the corresponding circulant. 

60 In practice, these protographs cannot be directly expanded 
into block-circulant codes without introducing low weight 
codewords, regardless of the choice of circulants. A practical 
solution is to expand the protographs twice, first with small 
permutation matrices, such as of size 4x4 or 8x8, and then 
65 with circulants to build the full code. The result is a parity 
check matrix such as the one shown in FIG. 3 for a very small 
AR4A code, where each nonzero entry in the matrix is rep- 
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resented by a dot. This code was constructed by putting the 
AR4A protograph variable nodes in the order (4, 2, 1, 5, 3) 
and check nodes in order (A, B, C) as demarcated by the solid 
lines, expanding with 4x4 permutations, and then expanding 
with 16x16 circulants. The resulting 12x20 block-circulant 
structure is emphasized by dotted lines. 

2. Iterative Encoders 

A description of a general method for LDPC encoding can 
be found in T. Richardson and R. Urbanke, “Efficient Encod- 
ing of Low-Density Parity -Check Codes,” IEEE Transactions 
on Information Theory, pp. 638-656, February 2001, incor- 
porated herein by reference in its entirety. The present section 
will describe a related encoding technique, called iterative 
encoding, that also take advantage of the block-circulant 
structure of the parity check matrix. 

An encoder for any (N,K) LDPC code can be built from an 
erasure correcting decoder. 

In accordance with the present disclosure, a set of K lin- 
early independent variable nodes are selected as the system- 
atic symbols, and these are initialized with the K information 
bits to be encoded. If there are no stopping sets, then the 
remaining N-K parity symbols are computed iteratively with 
the standard erasure correcting algorithm. Because the erased 
symbol positions are known a priori, the existence of stopping 
sets is also known. This method is equivalent to Richardson 
and Urbanke’ s low-complexity encoding algorithm when 
their variable g=0. However, differently from what shown in 
Richardson-Urbanke, the method according to the present 
disclosure is applied to block-circulant codes. 

If H has full rank R=N-K, and this iterative encoding 
method succeeds, then each of the N-K parity check equa- 
tions is solved exactly once to determine one of the N-K 
unknown parity symbols. For a check equation with d terms, 
d-2 exclusive-OR operations are required. Thus, iterative 
encoding requires exactly E-2R exclusive-OR operations, 
where E is the number of nonzero elements in H. For an 
arbitrary LDPC code, the scheduling of these computations 
can be complex; for block-circulant codes, they can be per- 
formed in well organized groups of T operations. The amount 
of memory required in such an encoder varies depending on 
the code structure; it is sufficient to store all N code symbols. 

The above process will be illustrated with the AR3A and 
AR4A code examples. AR3A and AR4A codes are accumu- 
late-repeat-accumulate codes with repetition 3 and 4, respec- 
tively, as described in Applicants’ patent application Ser. No. 
1 1/166,040 for “ARA Type Protograph Codes”, filed on the 
same day of the present application and incorporated herein 
by reference in its entirety 

2A. Iterative Encoder for AR4A Code 

When the rows and columns of the AR4A base matrix are 
reordered as (B, A, C) and (4, 2, 3, 1, 5), the following 
structure of the parity check matrix H is obtained: 


-2 3 10CT 
H= 00210 
.0 1 3 0 2. 

Iterative encoding begins by applying the kT=2T informa- 
tion symbols to the first two columns in the base matrix. The 
first row of T check equations can be solved in parallel to 
determine the third column of code symbols, and then the 
next row can be solved to determine the fourth column. The 2 
in the lower right comer means that each remaining check 
equation has two unknowns, and iterative encoding is halted 


by this stopping set. However, note that this parity check 
matrix is not lull rank: the sum of the first T and last T rows of 
H is the all-zero vector, independent of the circulants chosen. 
This means that one of the remaining T undetermined code 
5 symbols can be assigned an additional information bit, and 
iterative encoding now completes successfully, operating (in 
a permuted order) as an accumulator of length T. 

FIG. 4 shows a block diagram of an AR4A encoder per- 
forming the above described process steps. An input message 
io 10 comprises 2T input symbols s 0 and s x . 

Symbols s 0 are multiplied by a circulant matrix H^ Sym- 
bols s x are multiplied by matrix H 2 . The results are summed, 
producing the T untransmitted parity symbols, denoted p 0 , 
corresponding to the fifth column of H. The untransmitted 
15 parity symbols p 0 are then multiplied by matrix H 3 (upper 
right branch of FIG. 4) and matrix H 4 (lower right branch of 
FIG. 4). The matrix multiplication in the lower right branch of 
FIG. 4 computes the T parity symbols denoted p x , corre- 
sponding to the third column of H. The matrix multiply, 
20 permute (element 20), and accumulate (elements 31, 32) 
steps in the upper right branch of FIG. 4 compute T more 
parity symbols p 2 . Concatenating with the input message 10 
gives the systematic output codeword 40. 

As shown in FIG. 3, each row and each column of matrices 
25 H x and H 4 have Hamming weight 2, and each row and each 
column of matrices H 2 and H 3 have Hamming weight 3; these 
Hamming weights match the corresponding entries in the 
AR4A base matrix. 

30 2B. Iterative Encoder for AR3A Code 

The AR3A code shows somewhat different behavior. With 
the same row and column ordering, the AR3A base matrix is 

35 [22100' 

//= 0 12 10 
.0 1 2 0 2 . 

40 Foreseeing the problematic 2 in the lower right comer, one 

redundant check equation can be constructed by summing the 
last T rows of H to get the length N +P=5T vector, h=[0 r 1 T 0 T 
0 r 0 r ], where 0 r and 1 T represent strings of T zeros andT ones, 
respectively. This check equation shows that the first 2T vari- 
45 able nodes are not linearly independent, and cannot all be 
assigned information bits. Instead, information bits are 
assigned to the first 2T-1 and to the very last variable node. 
Iterative encoding begins with the constructed check equation 
h, which computes the 2T’th code symbol as the parity of the 
50 preceding T-l symbols. Iterative encoding then proceeds to 
completion exactly as for the AR4A code. 

The iterative encoding algorithm will be described picto- 
rially for a variation of the AR3 A code. In the AR3 A code of 
FIG. 1, variable node 4 is connected by two edges to check 
55 node B, and variable node 5 is also doubly connected to check 
node C. A very similar protograph can be constructed by 
crossing two of these edges, so nodes 4 and 5 are each singly 
connected to both B and C. When the columns are placed in 
order (1, 4, 5, 3, 2) and rows in order (A, B, C), the protograph 
60 is expanded first by a factor of 8, and then a second time with 
circulants of size 64x64, the result is the parity check matrix 
shown in FIG. 5. Key blocks are colored and labeled. By 
summing the bottom 2T=2M rows of the parity-check matrix 
in FIG. 5, we note that the 2T=2M code symbols s 0 , s x 
65 corresponding to the rightmost 2T=2M columns must have 
overall even parity. The encoder for this variation of the 
AR3A code assigns k-1 =2T-1 input message bits to the first 
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2T-1 positions of s 0 , s l5 and computes the T-thbit of s : as the 
parity of these 2T-1 message bits. The 2T-th message bit is 
used to initialize the state of the accumulator in FIG. 6. 

FIG. 6 shows the corresponding encoder architecture, 
where matrices III, 112, 113, U4+II5, and 116+117 correspond 5 
to those in FIG. 5. I and III through 117 are permutation 
matrices, so each row and each column of these matrices has 
Hamming weight 1 . The matrices are chosen so that the sums 
I+m, 114+115, and 116+117 consist entirely of rows and col- 
umns of weight 2. In FIG. 6, the last message bit of input to 
message 50 is replaced by the parity of the remaining message 
bits, and the discarded bit is used instead to initialize the 
accumulator 61, 62. On the top horizontal path through FIG. 

6, the puncture box 80 passes half of the k=2T bits s 0 , s x to 
serve as k/2-1 systematic symbols in the codeword. In other 15 
words, puncture box 80 discards s 0 and passes only s x . 

On the bottom horizontal path, a sparse matrix multiply 90 
is performed between the vector 100 of k message bits and a 
kxk/2 sparse matrix 110 comprised of matrices III and 112 of 
FIG. 5, to generate another k/2 codeword symbols p 0 . On the 20 
middle path, the k message bits 100 are multiplied by the 
sparse kxk matrix a of FIG. 5. The resulting k bits are re- 
ordered in a manner determined by n3 (see also FIG. 5), and 
then accumulated into a running sequence of outputs from an 
exclusive-OR gate. This forms the remaining k codeword 25 
symbols p 1? p 2 . 

Very similar algorithms are possible whenever the proto- 
graph can be lower triangularized as shown for the AR3A and 
AR4A codes, and the main diagonal consists exclusively of 
the numbers 1 and 2. Iterative encoders of this nature are not 30 
always possible. The AR34JA protograph (described later) 
cannot be lower-triangularized, and so iterative encoders of 
this style do not exist. Instead, the encoders described in the 
next section can be used. 

3. Encoders Using Block-Circulant Generator Matrices 

In the present section, the construction of systematic block- 
circulant generator matrices will be presented. 

The LDPC codes discussed here are defined by a block 
matrix H composed of circulants, and of size rTxnT, where 4Q 
r<n. A quasicyclic code is one for which a “quasicyclic shift” 
of a codeword is also a codeword. That is, if any codeword c 
is partitioned into binary strings of length T and each string is 
circularly shifted by the same amount, the resulting vector is 
also a codeword. It is immediate that any LDPC code defined 45 
by a block-circulant H matrix is quasicyclic. 

In some cases, such a code has a systematic generator 
matrix G of size (n-r)TxnT that is entirely composed of 
circulants. To construct such a generator matrix, the columns 
of H are sorted so that the kT symbols desired to be systematic 5Q 
appear first, followed by the remaining rT parity symbols. 
That is, let H=[Q S], where Q is of size rTxkT, and S is square 
and of size rTxrT. In general, if S is not full rank, then G 
cannot be quasicyclic. Otherwise, G is computed as 

G=p (n - r)I im T ] 55 

where I ( „_ r)r is the identity matrix of size (n-r)Tx(n-r)T. Not 
all block-circulant LDPC codes have block-circulant genera- 
tor matrices. As a particularly small example, suppose H is 
described by the single circulant with the first row [110 10 60 
0 0], and size 7x7. As noted above, this only has rank 4. One 
codeword is [1 1 10 10 0], and because the code is quasi- 
cyclic (in fact cyclic, because H consists of a single circulant), 
all cyclic shifts of this codeword are also codewords. How- 
ever, the circulant with the first row [1 1 10 10 0] only has 65 
rank 3, and so cannot be used in its entirety as a generator 
matrix. 


In the remainder of this section, reference will be made 
again to the AR3A and AR4A codes discussed earlier. 

3 A. Generator Matrix for AR3A 

The 3Tx5T parity check matrix for AR3A is full rank, and 
so a generator matrix for this code will have dimension 2T. 
The matrix H is partitioned into [Q S], where Q contains the 
columns to make systematic, and S is the square matrix of 
parity symbols that must be invertible. If Q is chosen to 
include the circulants corresponding to variable nodes 4 and 
2 in the protograph, as done for the iterative encoder, it can be 
found that S has rank rT-1, deficient by 1. This misfortune 
occurs because of the closed loop of degree-2 variable nodes 
created by protograph nodes 5 and C. 

Alternatively, one can choose to make protograph variable 
nodes 4 and 5 systematic. In this case, S has full rank, and a 
systematic block-circulant G can be calculated exactly as 
described. When this is done for the parity check matrix in 
FIG. 5, the result is the generator matrix shown in FIG. 7. An 
encoder that performs matrix multiplication by G is particu- 
larly suitable for hardware implementation as described in the 
next section. 

3B. Generator Matrix for AR4A 

The AR4A code will be now taken into consideration. For 
this code, there is no set of R columns that can be selected 
from H to form an invertible square matrix S, because H itself 
is rank deficient by 1 . Remarkably, these two defects cancel 
and the method for constructing G can proceed with minor 
modifications. Variable nodes 4 and 2 are selected to be sys- 
tematic, and when H is arranged to put these on the left, it 
appears as shown in FIG. 3. The left two fifths of H is the 
matrix Q, and the remaining square portion on the right is S. 
The equations are solved to find codewords of the form c 4 =[l 
0 2r _i p 1 p 5 p 3 ], and of the form c 2 =[0 r 1 0 r _ 1 p 1 p 5 p 3 ], where 
0; is a string of i zeros, and each p x is a binary string of length 
T. By expanding these solutions into circulants, a block- 
circulant “generator” matrix can be formed, 


G = 


c 4 


C 2 


of size 2Tx5T. This is one dimension short, and the missing 
codeword is c=[0 3r 1 T 0 r ] . Note that if c were expanded into 
circulants, the resulting TxnT matrix has rank 1. For imple- 
mentation, it is preferable to use this G as the generator matrix 
and discard the one additional dimension in the code, accept- 
ing the miniscule performance loss. The generator matrix G, 
corresponding to the parity check matrix of FIG. 3, is shown 
in FIG. 8. Because the last T code symbols are punctured, the 
rightmost columns of circulants would be deleted from G in 
implementation. By design, the first two columns of circu- 
lants form an identity matrix; the remaining circulants could 
have been dense by the construction algorithm, but the AR4A 
protograph structure assures that many remain sparse. 

3C. Generator Matrix for AR34JA 
A third example is the AR34JA codes, built from the pro- 
tograph, 


H = 


0 0 0 1 2 
2 2 10 1 
1110 2 
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These codes do not have an iterative encoder of the form 
described earlier, because H cannot be lower triangularized. 
However, quasicyclic encoders do exist. It is not hard to show 
that the two columns of H chosen for systematic symbols 
must be one of the first two (identical) columns, and either the 
third or the fifth column. For these choices, a quasicyclic 
encoder can be constructed in the usual way, just as for the 
AR3 A code described earlier. Any other choice fails to yield 
a quasicyclic encoder, because it results in a rank-deficient 
sub -matrix S that cannot be inverted. 

4. Software Implementation 

The iterative encoders described so far can often be imple- 
mented efficiently in software. This is because the computa- 
tions can be performed in parallel, operating on T symbols at 
a time. Moreover, the use of circulants means that the required 
reordering of symbols is typically minimal, unlike the situa- 
tion with more general pennutations. Preliminary results 
from a software implementation of this algorithm finds that it 
runs at 90 Kbps/MHz for several rate 1/2 AR3A codes. In 
particular, the software encoder runs at 128 Mbits/sec on a 
1 .42 GHz Macintosh, and can be expected to run at something 
like 1.8 Mbits/sec on a 20 MHz RAD6000 spaceflight quali- 
fied microprocessor. 

5 . Hardware Implementation 

The systematic block-circulant generator matrices devel- 
oped in the previous sections are particularly amenable to 
hardware implementation. A hardware encoder can pass the 
kT message bits to the output as code symbols, while inter- 
nally performing a multiplication by the (dense) kx(n-k) 
matrix in the right hand portion of G. The resulting vector 
serves as the remaining (n-k)T code symbols. A direct imple- 
mentation of this dense matrix multiplication is shown in 
FIG. 9, as proposed in S. Lin, “Quasi -Cyclic LDPC Codes.” 
CCSDS working group white paper, October 2003. The set of 
n-k cyclic shift registers at the top of the figure, each of length 
T, are loaded with the circulant patterns for the first row of G. 
For each message bit m z in turn, these registers are cycled 
once and, if m 7 =l , exclusive-ORed with the n-k symbol out- 
put register. When each row of circulants is completed, 
sequences for the next row of circulants in G are loaded into 
the shift registers. 

In accordance with the present disclosure, an improvement 
in the hardware encoder is to cyclicly shift the output register, 
rather than the circulant registers, as shown in FIG. 10. 

The hardware encoder of FIG. 10 comprises one-bit stor- 
age units 200 shown as squares, one-bit multipliers (logical 
AND gates) 210 shown as circled crosses, one-bit adders 
(logical Exclusive OR gates) 220 shown as circled plusses, 
and switches 230, 240. Many of these are organized into 
structures 250, 260 known as Recursive Convolutional 
Encoders (RCEs), as shown. For a block-circulant generator 
matrix G of size kT xnT, this encoder comprises primarily n-k 
RCEs, each of length T. 

The encoder is initialized by setting all the storage units 
200 within the RCEs to zero, and setting the n-k switches 230 
as shown in the figure. Setting of the switches 230 as shown in 
FIG. 10 allows the contents of the last storage unit of each 
RCE to be fed back to the first adder of that RCE. Then 
encoding is performed in a bit-serial fashion, T bits at a time. 
Before encoding the first T bits, each RCE is provided with a 
binary sequence of length T via the incoming arrows shown 
270 along its top edge 280. This binary sequence could be 
computed and placed in the collection of (n-k)T storage units 
190 shown by the row of boxes across the top of FIG. 10. 
Alternatively, the binary sequence could be provided directly 
by combinatorial logic driven by a message bit counter. These 
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binary sequences describe the circulants of the generator 
matrix, and so are called circulant patterns. The first set of 
circulant patterns used is given by the first row of the genera- 
tor matrix G, and it is stored from right to left in the boxes 190 
5 across the top of FIG. 10. 

The first T message bits are encoded sequentially as fol- 
lows. The first bit is sent directly to the output as the first 
codeword symbol. Simultaneously, the sequence stored in 
each RCE 250, 260 is either Exclusive ORed with the corre- 
10 sponding circulant pattern (if the message bit is a 1 ) or taken 
unmodified (if the message bit is a 0), and is right circularly 
shifted one position. Then the second bit is encoded the same 
way: it is taken as the second codeword symbol, and simul- 
taneously determines whether the circulant patterns are 
15 Exclusive ORed with the contents of each RCE, before the 
next right circular shift. This process is repeated until T mes- 
sage bits have been encoded. 

Before encoding the next T message bits, a new set of 
circulant patterns are computed and provided to the RCEs. 
20 Then message bits T+l through 2T are encoded by condi- 
tional Exclusive OR operations and right circular shifts. 
These steps are repeated until all kT message bits have been 
encoded. 

To complete generating the codeword, all n-k switches are 
changed to the opposite position from that shown, and all 
circulant patterns are set to zero. The contents of the n-k 
RCEs are then sequentially read out via right shifts as the 
parity portion of the codeword . 

30 In other words, when the output switch 240 is set as drawn 

in FIG. 10, all kT input message bits go straight through the 
RCEs unchanged, and serve as output codeword symbols. 
Simultaneously, each message bit is multiplied by a circulant 
pattern, and the result is added to the shifted register contents. 
35 Then, the switch 240 is flipped and (because the other (n-k- 
1) switches 230 are flipped too) the daisy-chain of all (n-k)T 
registers in the RCEs are read out sequentially. The result is a 
codeword of length nT, of which the first kT symbols are just 
a copy of the input message (i.e. the encoder of FIG. 10 is a 
40 systematic encoder). 

For each set of T message bits input through the input line, 
the circulant pattern is generally different for each of the 
RCEs. For example, FIG. 8 shows the block-circulant gen- 
erator matrix for an AR4A code. The first 128 columns of the 
45 matrix of FIG. 8 just have a diagonal line: this is an identity 
matrix that passes the input bits to the output unchanged. The 
remaining 1 92 columns comprise an 8x 12 array of circulants, 
each of size T=16. Except for the occasional coincidence, all 
96 circulants are different. 

50 This encoder has been implemented in hardware. It 
requires n-k D-latches, n-k exclusive-OR gates, and a mod- 
est amount of additional combinatorial logic. The size 
(k=1024, n=2048) LDPC code fits comfortably in a Xilinx 
XC3S200 Spartan Field Programmable Gate Array (FPGA), 
55 and runs at 100 Msymbols/second. Speed is determined by 
the maximum clock rate of the FPGA. The maximum sup- 
ported code size is determined primarily by the number of 
D-latches required to accumulate the parity, and so scales 
linearly with n-k. 

60 

5. Conclusion 

As many research groups have discovered in the last couple 
years, block-circulant LDPC codes have well structured 
decoders, and offer excellent error correction performance 
65 when designed carefully. The Applicants have shown in the 
present disclosure that block-circulant LDPC codes possess 
attractive encoders as well, of a couple different forms. 
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An iterative encoder is often possible for block-circulant 
LDPC codes, based on the standard erasure correction algo- 
rithm. Due to the circulant structure of the parity check 
matrix, the computational steps are typically sparse matrix 
multiplication by a circulant, permutation, and modulo-2 5 
accumulation. The circulant matrix multiplications operate 
on long strings of sequential bits, so parallel computations are 
practical and permit fast encoders. 

Encoders composed of linear feedback shift registers are 
another attractive alternative for block-circulant LDPC 10 
codes. These are based on the block-circulant generator 
matrices that these LDPC codes often possess. Such an 
encoder requires remarkably little hardware, and provides a 
fast, simple, bit-serial architecture. The Applicants have 
implemented these decoders in a small FPGA operating at 15 
100 Msymbols/second. 

The encoders and encoding methods disclosed herein are 
applicable to a wide range of communication problems. They 
would be of interest to any application that requires the excel- 
lent performance that LDPC codes provide, and that would 20 
benefit from low -complexity LDPC encoders. Examples 
include communications systems onboard spacecraft in deep 
space or in orbit around the earth, digital encoders in cellular 
telephones, encoders within data storage devices such as hard 
disk drives and magnetic tape recorders, and encoders within 25 
computer modems. 

While several illustrative embodiments of the invention 
have been shown and described in the above description, 
numerous variations and alternative embodiments will occur 
to those skilled in the art. Such variations and alternative 30 
embodiments are contemplated, and can be made without 
departing from the scope of the invention as defined in the 
appended claims. 

What is claimed is: 

1 . An encoding apparatus to encode message input symbols 
in accordance with an accumulate-repeat-accumulate code 
with repetition four, the apparatus comprising: 

a first multiplier to multiply a first portion of the input 
symbols with a first matrix, forming first intermediate 4Q 
symbols; 

a second multiplier to multiply a second portion of the 
input symbols with a second matrix, forming second 
intermediate symbols; 

a first adder to sum the first intermediate symbols with the 45 
second intermediate symbols, forming third intermedi- 
ate symbols; 

a third multiplier to multiply the third intermediate sym- 
bols with a third matrix, forming fourth intermediate 
symbols; 50 

a fourth multiplier to multiply the third intermediate sym- 
bols with a fourth matrix, forming a first set of output 
symbols; 

a second adder to sum the fourth intermediate symbols 
with the second portion of the input symbols, forming 55 
fifth intermediate symbols; 

a permuter to permute the fifth intermediate symbols, 
forming permuted symbols; and 

an accumulator to accumulate the permuted symbols, 
forming a second set of output symbols. 60 

2. The apparatus of claim 1, wherein the first portion of the 
input symbols corresponds to a first half of the input symbols 
and the second portion of the input symbols corresponds to a 
second half of the input symbols. 

3. The apparatus of claim 1, wherein output symbols are 65 
obtained by combining the first set of output symbols, the 
second set of output symbols, and the input symbols. 
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4. The apparatus of claim 1, wherein each row of the first 
matrix has Hamming weight 2, each row of the second matrix 
has Hamming weight 3, each row of the third matrix has 
Hamming weight 3, and each row of the fourth matrix has 
Hamming weight 2. 

5. The apparatus of claim 1, wherein the first matrix, sec- 
ond matrix, third matrix and fourth matrix are block-circulant 
matrices. 

6. A method for encoding message input symbols in accor- 
dance with an accumulate-repeat-accumulate code with rep- 
etition four, comprising: 

multiplying a first portion of the input symbols with a first 
matrix, forming first intermediate symbols; 
multiplying a second portion of the input symbols with a 
second matrix, forming second intermediate symbols; 
adding the first intermediate symbols to the second inter- 
mediate symbols, forming third intermediate symbols; 
multiplying the third intermediate symbols with a third 
matrix, forming fourth intermediate symbols; 
multiplying the third intermediate symbols with a fourth 
matrix, forming a first set of output symbols; 
adding the fourth intermediate symbols with the input sym- 
bols, forming fifth intermediate symbols; 
permuting the fifth intermediate symbols, forming per- 
muted symbols; and 

accumulating the permuted symbols, forming a second set 
of output symbols. 

7. The method of claim 6, wherein output symbols are 
obtained by combining the first set of output symbols, the 
second set of output symbols, and the input symbols. 

8. The method of claim 6, wherein the first portion of the 
input symbols corresponds to a first half of the input symbols 
and the second portion of the input symbols corresponds to a 
second half of the input symbols. 

9. The method of claim 6, wherein each row of the first 
matrix has Hamming weight 2, each row of the second matrix 
has Hamming weight 3, each row of the third matrix has 
Hamming weight 3, and each row of the fourth matrix has 
Hamming weight 2. 

10. The method of claim 6, wherein the first matrix, second 
matrix, third matrix and fourth matrix are block-circulant 
matrices. 

11. An encoding apparatus to encode message input sym- 
bols in accordance with an accumulate-repeat-accumulate 
code with repetition three, the apparatus comprising: 

a puncturing device, puncturing k input symbols and out- 
putting k/2 input symbols, forming a first set of output 
symbols; 

a first multiplier to multiply the k input symbols with a first 
matrix, forming first intermediate symbols; 
a second multiplier to multiply the k input symbols with a 
second matrix, forming a second set of output symbols; 
a permuter to permute the first intermediate symbols, form- 
ing permuted symbols; and 

an accumulator to accumulate the permuted symbols, 
forming a third set of output symbols. 

12. The apparatus of claim 11, wherein an output codeword 
is obtained by combining the first, second and third sets of 
output symbols. 

13. The apparatus of claim 11, wherein the first matrix is a 
block matrix of size kxk, comprising two on-diagonal sub- 
matrices each of size k/2 xk/2 and two off-diagonal sub- 
matrices each of size k/2 xk/2, and the second matrix is of size 
kxk/2, k being an integer. 

14. The apparatus of claim 13, wherein the two on-diagonal 
sub -matrices are zero and the two off-diagonal sub -matrices 
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each have rows of Hamming weight 2, and the second matrix 
has k/2 rows with Hamming weight 1 and k/2 rows with 
Hamming weight 2. 

15. The apparatus of claim 11, wherein the first matrix and 

the second matrix are block-circulant matrices. 5 

16. A method for encoding message input symbols in 
accordance with an accumulate-repeat-accumulate code with 
repetition three, comprising: 

puncturing k input symbols and outputting k/2 input sym- 
bols, forming a first set of output symbols, k being an to 
integer; 

multiplying the k input symbols with a first matrix, forming 
first intermediate symbols; 

multiplying the k input symbols with a second matrix, 
forming a second set of output symbols; 15 

permuting the first intermediate symbols, forming per- 
muted symbols; and 

accumulating the permuted symbols, forming a third set of 
output symbols. 

17. The method of claim 16, further comprising combining 20 
the first, second and third sets of output symbols to obtain an 
output codeword. 

18. The method of claim 16, wherein the first matrix is a 
block matrix of size kxk, comprising two on-diagonal sub- 
matrices each of size k/2 xk/2 and two off-diagonal sub- 25 
matrices each of size k/2 xk/2, and the second matrix is of size 
kxk/2. 

19. The method of claim 18, wherein the two on-diagonal 
sub-matrices are zero and the two off-diagonal sub-matrices 
each have rows of Hamming weight 2, and the second matrix 30 
has k/2 rows with Hamming weight 1 and k/2 rows with 
Hamming weight 2. 

20. The method of claim 16, wherein the first and second 
matrix are block-circulant matrices. 

21. An encoding apparatus to encode input symbols in 35 
accordance with a block-circulant low density parity check 
(LDPC) code, the apparatus comprising: 

a plurality of recursive convolutional encoders, each recur- 
sive convolutional encoder comprising storage units, 
multipliers and adders to encode the input symbols; and 40 

a plurality of circulant patterns to be fed to the recursive 
convolutional encoders, one set of patterns for each 
recursive convolutional encoder. 

22. The apparatus of claim 21, wherein the recursive 
encoders further comprise switches, each switch having a first 


14 

condition where contents of a rightmost storage unit of a 
recursive convolutional encoder are sent back to that recur- 
sive convolutional encoder and a second condition where 
contents of a rightmost storage unit are sent towards the 
output. 

23. The apparatus of claim 21, wherein output symbols are 
obtained by combining the input symbols with the contents of 
the storage units within the recursive convolutional encoders. 

24. A method for encoding input symbols in accordance 
with a block-circulant LDPC code, comprising: 

providing a plurality of recursive convolutional encoders, 
each recursive convolutional encoder comprising stor- 
age units, multipliers and adders; 

setting the storage units to a first binary value; 

repeating the following operations: 

i) computing a set of circulant patterns, 

ii) providing each recursive convolutional encoder with a 
binary sequence of T message bits, each message bit sent 
to the output as a codeword symbol, and each message 
bit being multiplied with a circulant pattern, summed to 
the result of a previous multiplication, stored in a storage 
unit and shifted, until the T message bits have been 
encoded, 

until kT message bits have been encoded; and 

generating an output codeword by reading the contents of 
the storage units of the recursive convolutional encod- 
ers, 

wherein k and T are integers. 

25. The method of claim 24, wherein the binary sequences 
are provided through an additional plurality of storage units. 

26. The method of claim 24, wherein the binary sequences 
are provided by combinatorial logic. 

27. The method of claim 24, wherein the binary sequences 
correspond to circulant patterns of a generator matrix. 

28. The method of claim 24, wherein the plurality of recur- 
sive convolutional encoders comprises n-k convolutional 
encoders, n being an integer, wherein n-k-1 recursive convo- 
lutional encoders comprise a switch, the switch having a first 
condition where the contents of the rightmost storage unit of 
a recursive convolutional encoder are sent to the leftmost 
adder of that recursive convolutional encoder, and a second 
condition where the contents of the rightmost storage unit of 
a recursive convolutional encoder are sent towards the output. 



